Skip to content

Conversation

@mahadzaryab1
Copy link
Collaborator

@mahadzaryab1 mahadzaryab1 commented Jan 1, 2026

Which problem is this PR solving?

Description of the changes

  • This PR reworks the ClickHouse reader to look up the type for string attributes. This needs to be done because the query APIs only accept string values for tag filters, so we need to look up their actual type to query ClickHouse

How was this change tested?

  • Updated unit tests
  • Integration tests pass

Checklist

@codecov
Copy link

codecov bot commented Jan 1, 2026

Codecov Report

❌ Patch coverage is 96.95122% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.51%. Comparing base (afdf1c3) to head (f4a5064).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
...nternal/storage/v2/clickhouse/tracestore/reader.go 96.26% 3 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7815      +/-   ##
==========================================
+ Coverage   95.44%   95.51%   +0.07%     
==========================================
  Files         305      312       +7     
  Lines       16192    16601     +409     
==========================================
+ Hits        15454    15857     +403     
- Misses        577      579       +2     
- Partials      161      165       +4     
Flag Coverage Δ
badger_v1 9.09% <0.00%> (-0.13%) ⬇️
badger_v2 1.91% <0.00%> (-0.03%) ⬇️
cassandra-4.x-v1-manual 13.46% <0.00%> (-0.19%) ⬇️
cassandra-4.x-v2-auto 1.90% <0.00%> (-0.03%) ⬇️
cassandra-4.x-v2-manual 1.90% <0.00%> (-0.03%) ⬇️
cassandra-5.x-v1-manual 13.46% <0.00%> (-0.19%) ⬇️
cassandra-5.x-v2-auto 1.90% <0.00%> (-0.03%) ⬇️
cassandra-5.x-v2-manual 1.90% <0.00%> (-0.03%) ⬇️
clickhouse 1.99% <0.00%> (+0.01%) ⬆️
elasticsearch-6.x-v1 17.37% <0.00%> (-0.24%) ⬇️
elasticsearch-7.x-v1 17.40% <0.00%> (-0.24%) ⬇️
elasticsearch-8.x-v1 17.55% <0.00%> (-0.25%) ⬇️
elasticsearch-8.x-v2 1.91% <0.00%> (-0.03%) ⬇️
elasticsearch-9.x-v2 1.91% <0.00%> (-0.03%) ⬇️
grpc_v1 8.49% <0.00%> (-0.12%) ⬇️
grpc_v2 1.91% <0.00%> (-0.03%) ⬇️
kafka-3.x-v2 1.91% <0.00%> (-0.03%) ⬇️
memory_v2 1.91% <0.00%> (-0.03%) ⬇️
opensearch-1.x-v1 17.44% <0.00%> (-0.24%) ⬇️
opensearch-2.x-v1 17.44% <0.00%> (-0.24%) ⬇️
opensearch-2.x-v2 1.91% <0.00%> (-0.03%) ⬇️
opensearch-3.x-v2 1.91% <0.00%> (-0.03%) ⬇️
query 1.91% <0.00%> (-0.03%) ⬇️
tailsampling-processor 0.55% <0.00%> (-0.01%) ⬇️
unittests 94.20% <96.95%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: Mahad Zaryab <[email protected]>
@mahadzaryab1 mahadzaryab1 added the changelog:experimental Change to an experimental part of the code label Jan 1, 2026
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
@github-actions
Copy link

github-actions bot commented Jan 1, 2026

Metrics Comparison Summary

Total changes across all snapshots: 0

Detailed changes per snapshot

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 106 metrics

➡️ View full metrics file


// getAttributeMetadata retrieves the types stored in ClickHouse for string attributes.
//
// The query service forwards all attribute filters as strings (via AsString()), regardless
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that true? iirc the UI sends the attributes as JSON, which mean at least numeric types might be coming not as strings, but then there is another translation layer from HTTL handler into AttributeMap in the QueryService

for key, attr := range query.Attributes.All() {
var attrType string
var val any
attributeMetadata, err := r.getAttributeMetadata(query.Attributes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will need to introduce some cache TTL parameter such that we don't keep running this query on every other query. The metadata doesn't change frequently.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this cache TTL parameter be exposed in the config? I can do this in following a PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it should be configurable but with sensible default

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
yurishkuro pushed a commit that referenced this pull request Jan 9, 2026
)

## Which problem is this PR solving?
- Towards #7134 

## Description of the changes
- Update the ClickHouse test driver which is used in unit tests to
handle multiple queries. This is done in preparation for #7815 which
adds an additional query to the attribute_metadata table.

## How was this change tested?
- CI and unit tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `npm run lint` and `npm run test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
func (r *Reader) buildStringAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value, metadata attributeMetadata) error {
levelTypes, ok := metadata[key]

// if no metadata found, assume string type
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro what're your thoughts on this?

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
@mahadzaryab1 mahadzaryab1 changed the title [WIP] Rework ClickHouse Attributes to look up type for string attributes [clickhouse] Rework ClickHouse Attributes to look up type for string attributes Jan 19, 2026
@mahadzaryab1 mahadzaryab1 marked this pull request as ready for review January 19, 2026 04:42
@mahadzaryab1 mahadzaryab1 requested a review from a team as a code owner January 19, 2026 04:42
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reworks the ClickHouse reader to support querying attributes that are provided as strings but may be stored with different types in ClickHouse. The query APIs only accept string values for tag filters, so the implementation looks up the actual stored types from an attribute_metadata table to correctly query the typed columns.

Changes:

  • Added attribute metadata model and querying functionality to determine stored types for string attributes
  • Refactored attribute condition building into separate functions for each type
  • Updated integration tests to reflect newly enabled tag-based query tests

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
internal/storage/v2/clickhouse/tracestore/dbmodel/attribute_metadata.go New model representing attribute metadata from ClickHouse
internal/storage/v2/clickhouse/tracestore/attribute_metadata.go Implementation of metadata retrieval and query building
internal/storage/v2/clickhouse/tracestore/attribute_metadata_test.go Tests for attribute metadata error cases
internal/storage/v2/clickhouse/tracestore/reader.go Refactored attribute filtering to look up types for string attributes and build typed conditions
internal/storage/v2/clickhouse/tracestore/reader_test.go Updated tests to include attribute metadata queries and new test cases
internal/storage/v2/clickhouse/sql/queries.go Added SelectAttributeMetadata SQL query
cmd/jaeger/internal/integration/clickhouse_test.go Updated skip list to enable previously failing tag-based tests
internal/storage/v2/clickhouse/tracestore/snapshots/*.sql SQL query snapshots for test verification

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +464 to +468
decoded, err := base64.StdEncoding.DecodeString(attr.Str())
if err != nil {
return fmt.Errorf("failed to decode bytes attribute %q: %w", key, err)
}
val = string(decoded)
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bytes attribute handling is incorrect. After decoding the base64 string to bytes, the code converts the raw bytes to a string using string(decoded), which can produce invalid UTF-8 and doesn't match the storage format. The database stores bytes attributes as base64-encoded strings (see line 377 in buildBytesAttributeCondition). The decoded bytes should be re-encoded to base64 before storing in val, or simply use the original base64 string directly without decoding.

Suggested change
decoded, err := base64.StdEncoding.DecodeString(attr.Str())
if err != nil {
return fmt.Errorf("failed to decode bytes attribute %q: %w", key, err)
}
val = string(decoded)
// Bytes attributes are stored in the database as base64-encoded strings.
// Use the original base64 string directly to match the storage format.
val = attr.Str()

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,22 @@
// Copyright (c) 2026 The Jaeger Authors.
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copyright year is set to 2026, but the current year is 2025. This should be corrected to 2025.

Suggested change
// Copyright (c) 2026 The Jaeger Authors.
// Copyright (c) 2025 The Jaeger Authors.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,84 @@
// Copyright (c) 2026 The Jaeger Authors.
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copyright year is set to 2026, but the current year is 2025. This should be corrected to 2025.

Suggested change
// Copyright (c) 2026 The Jaeger Authors.
// Copyright (c) 2025 The Jaeger Authors.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,69 @@
// Copyright (c) 2026 The Jaeger Authors.
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copyright year is set to 2026, but the current year is 2025. This should be corrected to 2025.

Suggested change
// Copyright (c) 2026 The Jaeger Authors.
// Copyright (c) 2025 The Jaeger Authors.

Copilot uses AI. Check for mistakes.
Comment on lines +354 to +408
func buildBoolAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) {
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.bool_attributes.key, s.bool_attributes.value)")
q.WriteString(" OR ")
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_bool_attributes.key, s.resource_bool_attributes.value)")
*args = append(*args, key, attr.Bool(), key, attr.Bool())
}

func buildDoubleAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) {
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.double_attributes.key, s.double_attributes.value)")
q.WriteString(" OR ")
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_double_attributes.key, s.resource_double_attributes.value)")
*args = append(*args, key, attr.Double(), key, attr.Double())
}

func buildIntAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) {
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.int_attributes.key, s.int_attributes.value)")
q.WriteString(" OR ")
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_int_attributes.key, s.resource_int_attributes.value)")
*args = append(*args, key, attr.Int(), key, attr.Int())
}

func buildBytesAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) {
attrKey := "@bytes@" + key
val := base64.StdEncoding.EncodeToString(attr.Bytes().AsRaw())
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.complex_attributes.key, s.complex_attributes.value)")
q.WriteString(" OR ")
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_complex_attributes.key, s.resource_complex_attributes.value)")
*args = append(*args, attrKey, val, attrKey, val)
}

func buildSliceAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) error {
attrKey := "@slice@" + key
b, err := marshalValueForQuery(attr)
if err != nil {
return fmt.Errorf("failed to marshal slice attribute %q: %w", key, err)
}
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.complex_attributes.key, s.complex_attributes.value)")
q.WriteString(" OR ")
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_complex_attributes.key, s.resource_complex_attributes.value)")
*args = append(*args, attrKey, b, attrKey, b)
return nil
}

func buildMapAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) error {
attrKey := "@map@" + key
b, err := marshalValueForQuery(attr)
if err != nil {
return fmt.Errorf("failed to marshal map attribute %q: %w", key, err)
}
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.complex_attributes.key, s.complex_attributes.value)")
q.WriteString(" OR ")
q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_complex_attributes.key, s.resource_complex_attributes.value)")
*args = append(*args, attrKey, b, attrKey, b)
return nil
}
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The non-string attribute condition builders (buildBoolAttributeCondition, buildDoubleAttributeCondition, buildIntAttributeCondition, buildBytesAttributeCondition, buildSliceAttributeCondition, buildMapAttributeCondition) only check span and resource levels, but not scope level. However, buildStringAttributeCondition supports scope level attributes (lines 482-489). This creates an inconsistency where scope-level attributes can only be queried if they arrive as strings and have metadata. Consider adding scope-level support to all attribute type conditions for consistency.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/storage changelog:experimental Change to an experimental part of the code enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants